I, Corey M. Crafton, declare as follows:

1. I have personal knowledge of the information contained herein.

2. I have over 10 years of experience with Archer-Daniels-Midland Company, 
including 6 years as a molecular biologist. My technical focus is on bacteria. I am also a 
registered patent agent. 

3. I am a co-inventor of the subject matter claimed in U.S. Patent Application No. 
09/978,763 ("the 763 application"), and as such I am familiar with the subject matter presented 
therein. I am also familiar with the prosecution of the 763 application. I have read and am 
familiar with the contents of the book excerpts and journal articles cited in this Declaration. 

4. As one skilled in the art of molecular biology in general and bacterial engineering 
in particular, I recognize the utility of the invention described and claimed in the 763 application. 
I recognize that the invention as claimed has a specific and substantial utility, based at least on 
the factors discussed below. 

5. One of ordinary skill in the art knows that a promoter is a nucleotide sequence that 
is recognized by RNA polymerase molecules which start RNA synthesis and that it is located 
immediately upstream of a gene. As explained in more detail in Devlin, T., Textbook of 
Biochemistry with Clinical Correlations, 689-696 (1997), a promoter consists of two highly 
conserved sequences: the -10 sequence (Pribnow box) and the -35 sequence. As stated in
Freifelder, D., Molecular Biology: A Comprehensive Introduction to Prokaryotes and
Eukaryotes, 375-379 (1983), page 377, "All sequences found in Pribnow boxes are considered to
be variants of the basic sequence TATAATG. The underscored T, at base 6 in the Pribnow box
... is present in all promoters sequenced to date." Figure 16.11 from Devlin, supra, page 690
shows these conserved sequences in many known E. coli promoters.

When the Pribnow box of SEQ ID NO: 7 of the present invention is aligned against Figure
16.11 of Devlin, only base 3 (C) differs from the most generally conserved sequence, which
has a T at base 3. The most active promoters fit the consensus sequence most closely. The
bases flanking the -10 and -35 sequences are only weakly conserved. Thus, the skilled person
would ordinarily expect SEQ ID NO: 7 to function as a promoter.
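The comparison described above can be sketched in a few lines of Python. This is only an illustration: the consensus string is the basic sequence quoted from Freifelder, the candidate string is the -10 sequence this Declaration reports for SEQ ID NO: 7, and the function name `mismatches` is a hypothetical helper, not part of the 763 application.

```python
# Illustrative sketch: compare a candidate -10 sequence with the basic
# Pribnow box sequence quoted from Freifelder.
CONSENSUS = "TATAATG"   # basic sequence quoted from Freifelder, page 377
CANDIDATE = "TACAATG"   # -10 sequence reported for SEQ ID NO: 7


def mismatches(a, b):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


diff = mismatches(CONSENSUS, CANDIDATE)
# Base 6 is the position Freifelder describes as a T in all sequenced promoters.
has_base6_t = CANDIDATE[5] == "T"
print("differing positions:", diff)
print("T at base 6:", has_base6_t)
```

Run as written, the script lists the single differing position (base 3) and confirms the T at base 6.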

My project was to isolate several promoter regions from the Corynebacterium glutamicum
lysine-producing strain. From research that had been done on promoters in E. coli, a list of
known E. coli promoter sequences was assembled. The promoter upstream of the lactate
dehydrogenase gene, ldh, was one of these. From the professionally annotated complete genome
sequence of Corynebacterium glutamicum, I located the genetic sequence that had been annotated
as the ldh gene. This annotation had been done by a professional organization that compared the
Corynebacterium genome with publicly known and available genetic sequences from other
organisms. The area of the Coryne genome that had the highest sequence identity to the known
E. coli ldh genetic sequence was therefore annotated as the Coryne ldh genetic sequence. At the
time of this invention, the ldh promoter from Corynebacterium glutamicum had not been identified
or annotated. I designed PCR primers to isolate a 500 bp fragment upstream of the annotated ldh
gene, since the ldh promoter should be upstream of the ldh coding region and should be between
20 and 200 bp long. As stated in Freifelder, supra, page 375, "The first step in transcription is
binding RNA polymerase to a DNA molecule. Binding occurs at particular sites called
promoters, which are specific sequences of 20-200 bases at which several interactions occur."
However, as one skilled in the art of molecular biology knows, 20-200 bp pieces of DNA are
somewhat difficult to work with because of their small size. In order to make isolation and
cloning steps easier, I designed the PCR primers to amplify a 500 bp piece. A 500 bp piece is
large enough to ensure definite capture of the entire promoter region and an easy isolation from
an electrophoresis gel. As shown in the specification, a 500 bp PCR product was amplified for
all potential promoter regions.

After isolation, the 500 bp piece was cloned into a screening vector to test for promoter 
activity. 
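The selection of a 500 bp region upstream of an annotated gene can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the function names, the 0-based coordinate convention, and the toy 16-base "genome" are hypothetical and are not taken from the 763 application or from any genome annotation.

```python
# Illustrative sketch: take the bases immediately 5' of an annotated gene,
# reading 5'->3' on the strand that carries the gene.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(genome, gene_start, gene_end, strand, length=500):
    """Return up to `length` bases upstream of a gene.

    gene_start and gene_end are 0-based, end-exclusive plus-strand
    coordinates (an assumed convention for this sketch).
    """
    if strand == "+":
        begin = max(0, gene_start - length)
        return genome[begin:gene_start]
    if strand == "-":
        stop = min(len(genome), gene_end + length)
        return reverse_complement(genome[gene_end:stop])
    raise ValueError("strand must be '+' or '-'")


toy_genome = "ACGTTGCAAGGCTTAA"                    # hypothetical sequence
plus_case = upstream_region(toy_genome, 8, 12, "+", length=4)
minus_case = upstream_region(toy_genome, 4, 8, "-", length=4)
print(plus_case, minus_case)
```

For a gene on the minus strand, the upstream region lies at higher plus-strand coordinates and must be reverse-complemented, which is why the sketch treats the two strands separately.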

6. Promoter utility is also shown by the β-galactosidase activity discussed in
Example 9 of the 763 application: "Increased expression of beta-galactosidase under the
transcriptional control of these transcriptional regulatory regions is shown in Table 9."
(Paragraph [0202]). Based on my knowledge as one skilled in the art who has reviewed the data
presented in Table 9 and throughout the specification, I would recognize that this increased
activity is indicative of promoter activity because increased β-galactosidase activity is a
conventionally used indicator of promoter activity in bacteria and fungi. Use of β-galactosidase
activity as an indicator of promoter activity is discussed in, for example, Scanlan, D.J., et al.,
"Construction of lacZ promoter probe vectors for use in Synechococcus: application to the
identification of CO2-regulated promoters," Gene, 90 (1990) 43-49; and Meyers, A.M., et al.,
"Yeast shuttle and integrative vectors with multiple cloning sites suitable for construction of lacZ
fusions," Gene, 45 (1986) 299-310.

7. Based on its sequence and on the functional data in Table 2, the regulator
presented in SEQ ID NO: 7 includes the nucleotide sequence TACAATG in the -10 position (the
"Pribnow box") relative to the nucleotide sequence TTGCCAGGC in the -35 position. The
Pribnow box in SEQ ID NO: 7 varies from the standard Pribnow box by only a single nucleotide
(C instead of T at base 3), and includes the definitive T nucleotide at the base 6 position. When
this element is positioned upstream of beta-galactosidase, the expression thereof is proof of
promoter function and hence utility.

8. In fact, in a more recent sequence search using Genbank
(http://www.ncbi.nlm.gov/blast), five highly conserved sequences were found. All five were
Corynebacterium glutamicum sequences. One of these sequences (AB191244) is publicly
annotated as the ldh promoter region. This sequence was submitted to Genbank on Mar 29,
2005. At the time of this invention, this sequence was not known or publicly available.
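The identity figures reported in Exhibit A can be checked with a short Python sketch. The two 60-base strings below are the first Query and Sbjct lines of the BX927156.1 alignment in Exhibit A; the function name `identity` is a hypothetical helper, and the sketch assumes an ungapped alignment, which matches the "Gaps = 0/500" entries in the exhibit.

```python
# Illustrative sketch: count identities in an ungapped pairwise alignment.
def identity(query, subject):
    """Return (matches, aligned length, 1-based mismatch positions)."""
    if len(query) != len(subject):
        raise ValueError("ungapped alignment requires equal lengths")
    mism = [i + 1 for i, (q, s) in enumerate(zip(query, subject)) if q != s]
    return len(query) - len(mism), len(query), mism


# First alignment block of Exhibit A (query bases 1-60 vs. BX927156.1).
QUERY = "AAAACAGCCAGGTTAGCGGCTGTAACCCACCACGGTTTCGGCAACAATGACGGCGAGAGA"
SBJCT = "AAAACAGCCAGGTTAGCAGCCGTAACCCACCACGGTTTCGGCAACAATGACGGCGAGAGA"

matches, length, mism = identity(QUERY, SBJCT)
print(matches, length, mism)

# Whole-alignment figure reported in Exhibit A: 494 identities over 500 bases.
overall_percent = 100 * 494 / 500
print(overall_percent)
```

The overall figure works out to 98.8 percent, which Exhibit A displays as 98%.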

I hereby declare that all statements made herein of my knowledge are true and that all
statements made on information and belief are believed to be true; and further that these 
statements were made with the knowledge that willful false statements and the like, so made, are 
punishable by fine or imprisonment, or both, under § 1001 of Title 18 of the United States Code 
and that such willful false statements may jeopardize the validity of any patents issuing from the 
present application. 

Exhibit A



BLASTN 2.2.13 [Nov-27-2005]

Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman
(1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.

RID: 1141056543-3021-6169699787.BLASTQ1

Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS,
GSS, environmental samples or phase 0, 1 or 2 HTGS sequences)
           3,742,891 sequences; 16,670,205,594 total letters

Query=
Length=500

                                                                    Score    E
Sequences producing significant alignments:                         (Bits)  Value

gi|41326831|emb|BX927156.1|  Corynebacterium glutamicum ATCC 1...    944    0.0
gi|42602314|dbj|BA000036.3|  Corynebacterium glutamicum ATCC 1303    944    0.0
gi|80973081|gb|DQ248874.1|   Corynebacterium glutamicum L-lacta...   944    0.0
gi|62086196|dbj|AB191244.1|  Corynebacterium glutamicum ldhA gene    658    0.0
gi|50428370|dbj|AB115088.1|  Corynebacterium glutamicum ldhA g...    383    4e-103

ALIGNMENTS

>gi|41326831|emb|BX927156.1| Corynebacterium glutamicum ATCC 13032, IS fingerprint type 4-5,
complete genome; segment 9/10
Length=349115

 Features in this part of subject sequence:
   putative membrane protein

 Score = 944 bits (476), Expect = 0.0
 Identities = 494/500 (98%), Gaps = 0/500 (0%)
 Strand=Plus/Minus

Query  1       AAAACAGCCAGGTTAGCGGCTGTAACCCACCACGGTTTCGGCAACAATGACGGCGAGAGA  60
               ||||||||||||||||| || |||||||||||||||||||||||||||||||||||||||
Sbjct  293611  AAAACAGCCAGGTTAGCAGCCGTAACCCACCACGGTTTCGGCAACAATGACGGCGAGAGA  293552

Query  61      GCCCACCACATTGCGATTTCCGCTCCGATAAAGCCAGCGCCCATATTTGCAGGGAGGATT  120
               ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  293551  GCCCACCACATTGCGATTTCCGCTCCGATAAAGCCAGCGCCCATATTTGCAGGGAGGATT  293492

Query  121     CGCCTGCGGTTTGGCGACATTCGGATCCCCGGAACCAGCTCTGCAATGACCTGCGCGCCG  180
               ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Sbjct  293491  CGCCTGCGGTTTGGCGACATTCGGATCCCCGGAACTAGCTCTGCAATGACCTGCGCGCCG  293432

Query  181     AGGGAAGCGAGGTGGGTGGCAGGTTTTAGTGCGGGTTTAAGCGTTGCCAGGCGAGTGGTG  240
               ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  293431  [subject line missing from source copy]

Query  241     AGCAAAGACGCTAGTCTGGGGAGCGAAACCATATTGAGTCATCTTGGCAGAGCATGCACA  300
               |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  293371  AGCAGAGACGCTAGTCTGGGGAGCGAAACCATATTGAGTCATCTTGGCAGAGCATGCACA  293312

Query  301     ATTCTGCAGGGCATAGATTGGTTTTGCTCGATTTACAATGTGATTTTTTCAACAAAAATA  360
               |||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
Sbjct  293311  ATTCTGCAGGGCATAGGTTGGTTTTGCTCGATTTACAATGTGATTTTTTCAACAAAAATA  293252

Query  361     ACACTTGGTCTGACCACATTTTCGGACATAATCGGGCATAATTAAAGGTGTAACAAAGGA  420
               ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  293251  ACACTTGGTCTGACCACATTTTCGGACATAATCGGGCATAATTAAAGGTGTAACAAAGGA  293192

Query  421     ATCCGGGCACAAGCTCTTGCTGATTTTCTGAGCTGCTTTGTGGGTTGTCCGGTTAGGGAA  480
               ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  293191  ATCCGGGCACAAGCTCTTGCTGATTTTCTGAGCTGCTTTGTGGGTTGTCCGGTTAGGGAA  293132

Query  481     ATCAGGAAGTGGGATCGAAA  500
               ||||||||||||||||||||
Sbjct  293131  ATCAGGAAGTGGGATCGAAA  293112



>gi|42602314|dbj|BA000036.3| Corynebacterium glutamicum ATCC 13032 DNA, complete genome
Length=3309401

 Features in this part of subject sequence:
   Hypothetical protein

 Score = 944 bits (476), Expect = 0.0
 Identities = 494/500 (98%), Gaps = 0/500 (0%)
 Strand=Plus/Minus

Query  1        AAAACAGCCAGGTTAGCGGCTGTAACCCACCACGGTTTCGGCAACAATGACGGCGAGAGA  60
                ||||||||||||||||| || |||||||||||||||||||||||||||||||||||||||
Sbjct  3113891  AAAACAGCCAGGTTAGCAGCCGTAACCCACCACGGTTTCGGCAACAATGACGGCGAGAGA  3113832

Query  61       GCCCACCACATTGCGATTTCCGCTCCGATAAAGCCAGCGCCCATATTTGCAGGGAGGATT  120
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  3113831  GCCCACCACATTGCGATTTCCGCTCCGATAAAGCCAGCGCCCATATTTGCAGGGAGGATT  3113772

Query  121      CGCCTGCGGTTTGGCGACATTCGGATCCCCGGAACCAGCTCTGCAATGACCTGCGCGCCG  180
                ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Sbjct  3113771  CGCCTGCGGTTTGGCGACATTCGGATCCCCGGAACTAGCTCTGCAATGACCTGCGCGCCG  3113712

Query  181      AGGGAAGCGAGGTGGGTGGCAGGTTTTAGTGCGGGTTTAAGCGTTGCCAGGCGAGTGGTG  240
                ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  3113711  AGGGAGGCGAGGTGGGTGGCAGGTTTTAGTGCGGGTTTAAGCGTTGCCAGGCGAGTGGTG  3113652

Query  241      AGCAAAGACGCTAGTCTGGGGAGCGAAACCATATTGAGTCATCTTGGCAGAGCATGCACA  300
                |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  3113651  AGCAGAGACGCTAGTCTGGGGAGCGAAACCATATTGAGTCATCTTGGCAGAGCATGCACA  3113592

Query  301      ATTCTGCAGGGCATAGATTGGTTTTGCTCGATTTACAATGTGATTTTTTCAACAAAAATA  360
                |||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
Sbjct  3113591  ATTCTGCAGGGCATAGGTTGGTTTTGCTCGATTTACAATGTGATTTTTTCAACAAAAATA  3113532

Query  361      ACACTTGGTCTGACCACATTTTCGGACATAATCGGGCATAATTAAAGGTGTAACAAAGGA  420
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  3113531  ACACTTGGTCTGACCACATTTTCGGACATAATCGGGCATAATTAAAGGTGTAACAAAGGA  3113472

Query  421      ATCCGGGCACAAGCTCTTGCTGATTTTCTGAGCTGCTTTGTGGGTTGTCCGGTTAGGGAA  480
                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  3113471  ATCCGGGCACAAGCTCTTGCTGATTTTCTGAGCTGCTTTGTGGGTTGTCCGGTTAGGGAA  3113412

Query  481      ATCAGGAAGTGGGATCGAAA  500
                ||||||||||||||||||||
Sbjct  3113411  ATCAGGAAGTGGGATCGAAA  3113392



>gi|80973081|gb|DQ248874.1| Corynebacterium glutamicum L-lactate dehydrogenase (ldh) and
pyruvate kinase (pyk) genes, complete cds
Length=4183

 Score = 944 bits (476), Expect = 0.0
 Identities = 494/500 (98%), Gaps = 0/500 (0%)
 Strand=Plus/Plus

Query  1    AAAACAGCCAGGTTAGCGGCTGTAACCCACCACGGTTTCGGCAACAATGACGGCGAGAGA  60
            ||||||||||||||||| || |||||||||||||||||||||||||||||||||||||||
Sbjct  103  AAAACAGCCAGGTTAGCAGCCGTAACCCACCACGGTTTCGGCAACAATGACGGCGAGAGA  162

Query  61   GCCCACCACATTGCGATTTCCGCTCCGATAAAGCCAGCGCCCATATTTGCAGGGAGGATT  120
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  163  GCCCACCACATTGCGATTTCCGCTCCGATAAAGCCAGCGCCCATATTTGCAGGGAGGATT  222

Query  121  CGCCTGCGGTTTGGCGACATTCGGATCCCCGGAACCAGCTCTGCAATGACCTGCGCGCCG  180
            ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Sbjct  223  CGCCTGCGGTTTGGCGACATTCGGATCCCCGGAACTAGCTCTGCAATGACCTGCGCGCCG  282

Query  181  AGGGAAGCGAGGTGGGTGGCAGGTTTTAGTGCGGGTTTAAGCGTTGCCAGGCGAGTGGTG  240
            ||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  283  AGGGAGGCGAGGTGGGTGGCAGGTTTTAGTGCGGGTTTAAGCGTTGCCAGGCGAGTGGTG  342

Query  241  AGCAAAGACGCTAGTCTGGGGAGCGAAACCATATTGAGTCATCTTGGCAGAGCATGCACA  300
            |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  343  AGCAGAGACGCTAGTCTGGGGAGCGAAACCATATTGAGTCATCTTGGCAGAGCATGCACA  402

Query  301  ATTCTGCAGGGCATAGATTGGTTTTGCTCGATTTACAATGTGATTTTTTCAACAAAAATA  360
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  403  ATTCTGCAGGGCATAGATTGGTTTTGCTCGATTTACAATGTGATTTTTTCAACAAAAATA  462

Query  361  ACACTTGGTCTGACCACATTTTCGGACATAATCGGGCATAATTAAAGGTGTAACAAAGGA  420
            |||| |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  463  ACACATGGTCTGACCACATTTTCGGACATAATCGGGCATAATTAAAGGTGTAACAAAGGA  522

Query  421  ATCCGGGCACAAGCTCTTGCTGATTTTCTGAGCTGCTTTGTGGGTTGTCCGGTTAGGGAA  480
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  523  ATCCGGGCACAAGCTCTTGCTGATTTTCTGAGCTGCTTTGTGGGTTGTCCGGTTAGGGAA  582

Query  481  ATCAGGAAGTGGGATCGAAA  500
            ||||||||||||||||||||
Sbjct  583  ATCAGGAAGTGGGATCGAAA  602



>gi|62086196|dbj|AB191244.1| Corynebacterium glutamicum ldhA gene, promoter region
Length=348

 Score = 658 bits (332), Expect = 0.0
 Identities = 344/348 (98%), Gaps = 0/348 (0%)
 Strand=Plus/Plus

Query  150  CGGAACCAGCTCTGCAATGACCTGCGCGCCGAGGGAAGCGAGGTGGGTGGCAGGTTTTAG  209
            |||||| ||||||||||||||||||||||||||||| |||||||||||||||||||||||
Sbjct  1    CGGAACTAGCTCTGCAATGACCTGCGCGCCGAGGGAGGCGAGGTGGGTGGCAGGTTTTAG  60

Query  210  TGCGGGTTTAAGCGTTGCCAGGCGAGTGGTGAGCAAAGACGCTAGTCTGGGGAGCGAAAC  269
            ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Sbjct  61   TGCGGGTTTAAGCGTTGCCAGGCGAGTGGTGAGCAGAGACGCTAGTCTGGGGAGCGAAAC  120

Query  270  CATATTGAGTCATCTTGGCAGAGCATGCACAATTCTGCAGGGCATAGATTGGTTTTGCTC  329
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  121  CATATTGAGTCATCTTGGCAGAGCATGCACAATTCTGCAGGGCATAGATTGGTTTTGCTC  180

Query  330  GATTTACAATGTGATTTTTTCAACAAAAATAACACTTGGTCTGACCACATTTTCGGACAT  389
            ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Sbjct  181  GATTTACAATGTGATTTTTTCAACAAAAATAACACATGGTCTGACCACATTTTCGGACAT  240

Query  390  AATCGGGCATAATTAAAGGTGTAACAAAGGAATCCGGGCACAAGCTCTTGCTGATTTTCT  449
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  241  AATCGGGCATAATTAAAGGTGTAACAAAGGAATCCGGGCACAAGCTCTTGCTGATTTTCT  300

Query  450  GAGCTGCTTTGTGGGTTGTCCGGTTAGGGAAATCAGGAAGTGGGATCG  497
            ||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  301  GAGCTGCTTTGTGGGTTGTCCGGTTAGGGAAATCAGGAAGTGGGATCG  348



>gi|50428370|dbj|AB115088.1| Corynebacterium glutamicum ldhA gene for lactate dehydrogenase,
complete cds
Length=1456

 Score = 383 bits (193), Expect = 4e-103
 Identities = 196/197 (99%), Gaps = 0/197 (0%)
 Strand=Plus/Plus

Query  304  CTGCAGGGCATAGATTGGTTTTGCTCGATTTACAATGTGATTTTTTCAACAAAAATAACA  363
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  1    CTGCAGGGCATAGATTGGTTTTGCTCGATTTACAATGTGATTTTTTCAACAAAAATAACA  60

Query  364  CTTGGTCTGACCACATTTTCGGACATAATCGGGCATAATTAAAGGTGTAACAAAGGAATC  423
            | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  61   CATGGTCTGACCACATTTTCGGACATAATCGGGCATAATTAAAGGTGTAACAAAGGAATC  120

Query  424  CGGGCACAAGCTCTTGCTGATTTTCTGAGCTGCTTTGTGGGTTGTCCGGTTAGGGAAATC  483
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  121  CGGGCACAAGCTCTTGCTGATTTTCTGAGCTGCTTTGTGGGTTGTCCGGTTAGGGAAATC  180

Query  484  AGGAAGTGGGATCGAAA  500
            |||||||||||||||||
Sbjct  181  AGGAAGTGGGATCGAAA  197



Gene, 90 (1990) 43-49 

SUMMARY

It was shown that the Escherichia coli lacZ gene could be expressed in the cyanobacterium Synechococcus R2 PCC7942
both as a plasmid-borne form and also integrated into the chromosome. A promoterless form of the lacZ gene was constructed
and used as a reporter gene to make transcriptional fusions with cyanobacterial promoters using a shuttle vector system and
also via a process of integration by homologous recombination. Synechococcus R2 promoter-lacZ gene fusions were then used
to identify CO2-regulated promoters, by quantitatively assessing β-galactosidase activity under high and low CO2 conditions
using a fluorescence assay. Several promoters induced under low CO2 conditions were detected.



INTRODUCTION 

Cyanobacteria are capable of oxygen-evolving photosynthesis
and the utilization of CO2 as a sole source of
carbon. Unicellular cyanobacteria offer attractive systems
for the study of photosynthesis, particularly so when gene
transfer is possible by either natural or recombinant means.
Synechococcus R2 PCC7942 is an organism which is most
readily transformed, and efficient shuttle vectors have been
in use for some time (Kuhlemeier et al., 1981).



Correspondence to: Dr. D.J. Scanlan, Department of Biological Sciences,
University of Warwick, Coventry CV4 7AL (U.K.) Tel. (0203)523544;
Fax (0203)523701.

Abbreviations: aa, amino acid(s); Ap, ampicillin; βGal, β-galactosidase;
bp, base pair(s); Ci, inorganic carbon; Cm, chloramphenicol; cpc, gene
encoding phycocyanin; DMSO, dimethylsulfoxide; kb, kilobase(s) or
1000 bp; Km, kanamycin; K_m, Michaelis-Menten constant; MCS, multiple
cloning site; MUG, 4-methylumbelliferyl-β-D-galactopyranoside;
nt, nucleotide(s); ONPG, o-nitrophenyl-β-D-galactopyranoside; PolIk,
Klenow (large) fragment of E. coli DNA polymerase I; R, resistant/resistance;
RuBisCO, D-ribulose 1,5-bisphosphate carboxylase/oxygenase;
TE, 10 mM Tris/1 mM EDTA pH 8.0; XGal,
5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside; [ ], denotes plasmid-carrier state.



Cyanobacteria, together with some unicellular eukaryotic
phototrophs and certain lower aquatic plants, are capable
of concentrating exogenous bicarbonate (see Badger, 1987)
thereby providing a high internal concentration of CO2 for
RuBisCO and overcoming the high K_m of this enzyme for
its substrate. Recently, mutants have been isolated which
require high CO2 conditions to grow (see Marcus et al.,
1986; Abe et al., 1988) and their use may yield information
on the molecular nature of Ci uptake.

Various reporter genes, e.g., cat (Friedberg and Seijffers,
1986) and lux (Schmetterer et al., 1986) have been used to
assess the expression of specific genes in cyanobacteria.
Expression of βGal in the marine cyanobacterium Synechococcus
sp. PCC7002, has been reported (Buzby et al., 1985)
and applied to assessment of the effect of light intensity and
nitrogen availability on cpc-lacZ gene fusions (Gasparich
et al., 1987). Such studies show that lacZ gene fusions can
be used to monitor gene expression in cyanobacteria. It was
consequently decided to take the approach that
cyanobacterial DNA-lacZ gene fusions could be used to
identify presumptive CO2-regulated promoters by the differential
activity of βGal under high and low CO2 conditions.



0378-1119/90/$03.50 © 1990 Elsevier Science Publishers B.V. (Biomedical Division)



RESULTS AND DISCUSSION 

(a) Synthesis of βGal in Synechococcus R2-SPc PCC7942

Two strategies are applicable, in cyanobacteria, to the
construction of recombinants in which the expression of a
reporter gene is driven from a cyanobacterial promoter.
Either the hybrid may be introduced into the chromosome
by homologous recombination or be expressed from an
independently replicating shuttle vector (Table I). Using the
latter approach we have obtained gene fusions in which
expression of the reporter gene was driven by promoters
whose activity was controlled by CO2 availability. It was
decided to use lacZ as the reporter gene since it had been
shown to be efficiently expressed in the unicellular
cyanobacterium Synechococcus PCC7002 (Buzby et al., 1985).

To confirm that the lacZ gene could be expressed in
Synechococcus R2-SPc it was introduced into the E. coli/
Synechococcus shuttle vector pUC105 (Kuhlemeier et al.,
1981), by ligating a 4.2-kb EcoRI-SalI fragment from
pTEBG3 into EcoRI + SalI-digested pUC105. The resulting
14-kb plasmid pTUC1 was introduced into
Synechococcus R2-SPc. ApR transformants were obtained
at a frequency of 10^2-10^3/µg DNA. The presence of the
lacZ gene was confirmed by Southern blotting and by the
fluorescence of transformants after spraying with MUG.
(We found that XGal was a less suitable indicator because
the endogenous pigmentation of Synechococcus colonies
obscured the indicator colour.) In addition, MUG can only
detect βGal present in bacteria at the time the substrate is
applied, in contrast with colour reactions produced by bacterial
colonies grown on agar plates that include XGal,
which reflect substrate hydrolysis throughout the development
of the colony.

(b) Construction of generalised promoter-probe vectors

Fig. 1A describes the construction of this new lacZ fragment
and of pDAH216 and pDAH274. The sequence
around the 5' end of lacZ showing the fusion point with
trpA and stop codons is illustrated in Fig. 1B.

(c) Construction of the lacZ promoter probes for use in
Synechococcus

Fig. 2 describes construction of the new lacZ promoter
probe plasmids, based on the shuttle vector pUC105


TABLE I

Bacterial strains and plasmids

Strain or plasmid (a)   Characteristics (b)                                           Source/reference

Escherichia coli
  DH1                   F-, recA1, endA1, gyrA96                                      Maniatis et al. (1982)
  MC1061                araD139, Δ(ara-leu)7697, ΔlacX74, galK-, hsdR-, hsdM+, strA   Casadaban and Cohen (1980)

Synechococcus
  PCC7942               R2-SPc (small plasmid cured)                                  This laboratory

Plasmids
 E. coli
  pTEBG3                ApR, pBR322::lacZ                                             S. Elledge
  pREG422               ApR                                                           Shimkets et al. (1983)
  pIC19H                ApR                                                           Marsh et al. (1984)
  pDAH216               ApR, promoterless lacZ                                        This study
  pDAH274               ApR, KmR, promoterless lacZ, P1 inc                           This study
 Synechococcus
  pUC105                ApR, CmR                                                      Kuhlemeier et al. (1981)
  pUC303                SmR, CmR                                                      Kuhlemeier et al. (1983)
  pUC105XS              ApR, CmR, XhoI-SalI deletion                                  This study
  pUC105H               CmR, HindIII deletion of pUC105                               This study
  pTUC1                 CmR, ApR, lacZ                                                This study
  pLACPB1               ApR, CmR, promoterless lacZ                                   This study
  pLACPB2               ApR, CmR, transcription terminators, promoterless lacZ        This study

(a) The cyanobacterium Synechococcus R2-SPc PCC7942 was grown at 34°C in Allen's medium (Allen, 1968) in an orbital shaker and illuminated at a
light intensity of 30-40 µE/m²/s. Low CO2 cultures were grown in an environment gassed with air and high CO2 cultures in a gas phase of 5% (v/v)
CO2 in air with Allen's medium supplemented with 10 mM NaHCO3. Solid medium contained 1.5% (w/v) Bacto agar with the agar and Allen's medium
autoclaved separately. Antibiotic concentrations used for Synechococcus R2-SPc grown in liquid medium were 1 µg Ap/ml and 10 µg Cm/ml. E. coli was
grown in nutrient broth or on nutrient agar at 37°C. Antibiotic concentrations in both liquid and solid medium were 50 µg Ap/ml and 30 µg Cm/ml.
(b) Δ, deletion; P1 inc, incompatibility region for phage P1.

[Fig. 1: plasmid construction diagram (panel A) and nucleotide sequence panel (panel B, labelled -35, -10, operator, RBS, Met Thr Met Ile); graphic not reproduced. Legible portions of the legend follow.]

Fig. 1. Construction of the generalised promoter-probe vectors. (A) The lacZ gene [...] (Casadaban et al., 1980) was modified to: (i) remove as much as
possible of the unwanted trpA DNA upstream from the lacZ [...]; (ii) remove the EcoRI [...]. The MCS includes a, now unique, EcoRI restriction site.
Plasmid pTEBG3 contains a modified lacZ gene in which the EcoRI [...]. However, the same aa are still encoded in the region of the [...]. A 2.6-kb [...]
fragment of pTEBG3 containing the C-terminal half of lacZ was ligated to [...] fragment lacking the EcoRI site in the terminal end of the lacZ gene. [...]
produce pDAH274. [...] sequences, whilst thick lines represent drug resistance genes. lacZ, P1 inc (incompatibility region for phage P1) [...].
(B) Sequence of the 5' end of lacZ and the 3' end of trpA (Casadaban et al., 1980) [...] (S. McGowan, personal communication). Italicized letters
represent bp not present in the trpA-lacZ fusion [...]. RBS, ribosome-binding [...].

Fig. 2. Construction of Synechococcus lacZ promoter probes. The BamHI site in pUC105 was removed by filling-in cut ends using PolIk, and the resulting
vector (pUC105 Bam-) cut with XhoI + SalI. Deletion experiments with pUC105 showed that an XhoI-SalI deletion did not affect transformation frequency
(Table II). This was in contrast to deletion of a 4-kb HindIII fragment from pUC105 which completely abolished replication of pUC105 in Synechococcus
R2. This would agree with the proposal that the cyanobacterial replication origin of this plasmid is contained within a 4.65-kb BamHI-XhoI restriction
fragment (Gendel, 1987). Insertion of a promoterless lacZ gene from pDAH216, or pDAH274 (containing transcription termination signals at either end
of the lacZ gene), into the large XhoI-SalI fragment of pUC105 Bam- produced pLACPB1 and pLACPB2, respectively.



(Kuhlemeier et al., 1981) for use in Synechococcus R2
PCC7942. Both vectors contain a unique BamHI site for
insertion of cyanobacterial chromosomal DNA. Synechococcus
R2-SPc chromosomal DNA libraries were constructed
with these vectors, cloning partial Sau3AI digested
chromosomal DNA into this unique BamHI site. Transformation
frequencies of up to 10^6 transformants/µg DNA
were obtained (Table II).

Although plasmid promoter probe vectors have been
widely used for studying gene fusions, the background
expression of an intact lacZ gene on a multicopy plasmid,
even lacking a recognisable promoter, is apparently quite
high (Casadaban et al., 1980). Thus, it may be difficult to
distinguish strains carrying the desired fusion from those
carrying the parent plasmid. pLACPB1 indeed exhibited
some endogenous βGal activity (Table III). However, using
pLACPB2 which contained a transcription termination
signal upstream from the site of insertion of chromosomal
DNA, expression of βGal was reduced twofold. This difference
reflects a transcriptional effect since the plasmid
origin of replication (and hence the plasmid copy number)
is the same in each case.

(d) Use of pLACPB1 and pLACPB2 to identify CO2-regulated
promoters

Synechococcus R2-SPc chromosomal DNA libraries,
constructed in pLACPB1 and pLACPB2 with approx. 4-kb
fragments generated by Sau3AI partial digestion, were used
to transform Synechococcus R2-SPc under normal low CO2
conditions. Transformants were restreaked onto Allen's
medium containing 7.5 µg Cm/ml, and then replica-plated
onto solid medium containing Cm + 10 mM NaHCO3.
Plates were placed inside sealed gas bags before gassing
with 5% CO2 in air. After five days, corresponding high and
low CO2 transformation plates were sprayed with MUG
and photographed. This initial screening allowed a preliminary
identification of transformants exhibiting CO2-regulated
expression of βGal (Fig. 3). Interesting transformants
were then grown in liquid medium under high and
low CO2 conditions, and βGal was assayed throughout the
growth curve using the MUG assay. Generally, βGal
activity increased proportionately with growth, though a
few transformants showed a slight decrease when reaching
stationary phase. Differences in βGal activity under high or
low CO2 conditions were observed in 8 of 600 pLACPB1
or 17 of 2500 pLACPB2 transformants screened, showing
either greater or lesser βGal activity under the different CO2
concentrations. Table III shows some examples. CO2 concentration
did not significantly affect βGal activity in control
cultures.
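The screening logic described above, comparing βGal activity of one transformant under low and high CO2, can be sketched as follows. This is a hypothetical illustration only: the function name, the twofold threshold, and the activity values are assumptions chosen for the example and do not come from the article or its Table III.

```python
# Illustrative sketch: classify a transformant by the ratio of its βGal
# activity under low CO2 to its activity under high CO2.
def co2_response(low_co2_activity, high_co2_activity, threshold=2.0):
    """Return (label, ratio) for one transformant.

    threshold is an assumed cut-off for calling a difference meaningful;
    the article does not state one.
    """
    if low_co2_activity <= 0 or high_co2_activity <= 0:
        raise ValueError("activities must be positive")
    ratio = low_co2_activity / high_co2_activity
    if ratio >= threshold:
        return "higher under low CO2", ratio
    if ratio <= 1.0 / threshold:
        return "higher under high CO2", ratio
    return "no clear CO2 effect", ratio


# Hypothetical activity values (arbitrary fluorescence units).
examples = {"clone_A": (80.0, 20.0), "clone_B": (10.0, 40.0), "clone_C": (30.0, 28.0)}
for name, (low, high) in examples.items():
    label, ratio = co2_response(low, high)
    print(name, label, round(ratio, 2))
```

Using a ratio rather than a difference keeps the comparison independent of the absolute background activity of the vector, which the article notes differs between pLACPB1 and pLACPB2.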

Recent observations suggest that light intensity may also 
have a controlling effect on lacZ expression of individual 
transformants (data not shown). This is in agreement with 
the idea that metabolic conditions within the cell might be 

TABLE II

Transformation frequencies for Synechococcus R2-SPc shuttle vector,
integrative vector and promoter-probe vectors

Plasmid                                                 Selection (a)   Transformants/µg DNA (b)

pUC105                                                  CmR             10^[?]
pUC105 XhoI-SalI deletion                               CmR             10^[?]
pUC105 HindIII deletion                                 CmR             zero
pLACPB1                                                 CmR             10^6
pLACPB1 chromosomal DNA library                         CmR             10^[?]-10^[?]
pLACPB2                                                 CmR             [?]
pLACPB2 chromosomal DNA library                         CmR             10^[?]-10^3
pTUC1                                                   ApR             10^[?]-10^5
pDAH274                                                 KmR             zero
pDAH274 chromosomal DNA library, 4-kb DNA fragments     KmR             10^[?]-10^4


(a) Small-scale plasmid isolation from cyanobacteria used the rapid
boiling method of Holmes and Qu[...].
Cyanobacterial chromosomal DNA extraction was based on a method
described by Lind et al. (1985) with modifications. A late-log phase culture
(25 ml) was spun in a MSE [...] centrifuge at 5000 rpm for 10 min,
resuspended in 0.5 ml 0.25 M Tris pH 8.0/20% (w/v) sucrose/lysozyme
10 mg per ml, and the cells were incubated for 1 h at 37°C. Sarkosyl ([...]
of 30% (v/v) solution) and 20 µl of proteinase K (5 mg/ml) was then
added, and the cells incubated at 65°C for 1 h. An equal volume of
phenol : chloroform was added, and the mixture vortexed and spun for
4 min in an eppendorf centrifuge. The supernatant was dialysed overnight
against TE buffer and stored at -20°C. Plasmid constructions and
transformation of E. coli were performed by standard techniques described in
Maniatis et al. (1982). Restriction enzymes (Amersham International)
were used under conditions recommended by the manufacturers.
(b) Transformation of Synechococcus R2-SPc PCC7942 was performed as
described by Kuhlemeier et al. (1981). Where appropriate, transformants
were replica plated onto solid medium containing 10 mM NaHCO3 plus
antibiotic (7.5 µg Cm/ml or 1 µg Ap/ml). These plates were placed inside
sealed plastic bags containing an atmosphere of 5% (v/v) CO2 in air and
continuously illuminated.



similar under low CO2 levels and high light intensities, and
follows the identification of a 42-kDa cytoplasmic membrane
protein from Synechococcus R2 which has been
shown to be regulated by CO2 concentration and light
intensity (Reddy et al., 1989). Using both MUG and
ONPG assays it was shown that many of the CO2-regulated
promoters were functional in E. coli (data not
shown).

We have recently constructed a Synechococcus R2 gene
library directly into pDAH274, a vector incapable of independent
replication in cyanobacteria. Using this insert-directed
integration system, transformants were obtained at
high frequency (Table II), were stable in the presence of
Km, and showed differential lacZ expression. pDAH274
without inserts failed to transform Synechococcus R2-SPc.
The control of the lacZ gene from cyanobacterial promoters
maintained solely on the chromosome simplifies problems
of plasmid copy number.



TABLE III

Expression of lacZ in selected Synechococcus R2-SPc transformants grown under low and high CO2 conditions

Transformant a                          βGal activity (MUG units b)             Ratio of βGal activity
                                        Air level CO2    5% (v/v) CO2 in air    (low CO2/high CO2)

Synechococcus R2-SPc (untransformed)         0.6               0.6                   1.0
pLACPB1 control                             46.0              32.0                   1.4
  8                                         29.0             117.0                   0.25
  10                                      1560.0             450.0                   3.5
  14                                        82.0              29.0                   2.8
  19                                      2630.0              79.0                  33.0
pLACPB2 control                             24.0              17.0                   1.4
  4                                        925.0             232.0                   4.0
  5                                       1955.0             223.0                   9.0
  9                                        370.0              57.0                   6.5
  17                                       423.0              42.0                  10.0

a See Table I. Transformations were carried out as described in Table II, footnote b. Nos. 8, 10, 14 and 19 represent specific pLACPB1 chromosomal
DNA library transformants. Nos. 4, 5, 9 and 17 are specific pLACPB2 chromosomal DNA library transformants.

b βGal activity was assayed using either ONPG as described by Miller (1972), with data expressed as the increase in A420/min/ml/mg protein, or using
MUG, a quantitative fluorimetric assay for βGal specific activity, carried out as described by Youngman (1987). MUG units represent pmol MUG
hydrolysed/ml/min standardised for culture density.
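
The ratio column of Table III can be recomputed from the two activity columns as a consistency check. The sketch below is illustrative only: it assumes the activity entries read as 32.0, 117.0 and 925.0 where the scan is unclear, it assumes the rows pair with the transformant labels in the order listed, and the names `ACTIVITIES` and `co2_ratio` are hypothetical.

```python
# Minimal sketch: recompute the low-CO2/high-CO2 ratio for each Table III row.
# Values are MUG units transcribed from the table; unclear digits are assumed.
ACTIVITIES = {
    "untransformed": (0.6, 0.6),        # (air-level CO2, 5% CO2 in air)
    "pLACPB1 control": (46.0, 32.0),
    "pLACPB1 no. 8": (29.0, 117.0),
    "pLACPB1 no. 10": (1560.0, 450.0),
    "pLACPB1 no. 14": (82.0, 29.0),
    "pLACPB1 no. 19": (2630.0, 79.0),
    "pLACPB2 control": (24.0, 17.0),
    "pLACPB2 no. 4": (925.0, 232.0),
    "pLACPB2 no. 5": (1955.0, 223.0),
    "pLACPB2 no. 9": (370.0, 57.0),
    "pLACPB2 no. 17": (423.0, 42.0),
}


def co2_ratio(low_co2_activity: float, high_co2_activity: float) -> float:
    """Ratio of activity at air-level CO2 to activity at 5% CO2."""
    return low_co2_activity / high_co2_activity


ratios = {name: co2_ratio(low, high) for name, (low, high) in ACTIVITIES.items()}

for name, value in ratios.items():
    print(f"{name:18s} {value:6.2f}")
```

A ratio above 1 marks a fusion that is more active at air-level CO2; a ratio below 1 (for example transformant no. 8) marks one that is more active at 5% CO2.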
Fig. 3. Differential lacZ activity by Synechococcus R2-SPc[pLACPB2] transformants grown under high (A) and low (B) CO2 conditions as assessed by
spraying plates with MUG. MUG was applied after patched bacterial colonies had developed, by spraying the plate with a MUG solution (10 mg/ml
in DMSO). Plates were held 30 cm away from the atomizer nozzle, and a fine spray of MUG was delivered over the surface of the agar. The
plates were visualised under long wavelength ultraviolet light and photographed using Polaroid 667 film at f/11 for 1/8 of a second using a Kodak No.
45 Wratten gelatin filter. Magnification, x 0.7. The arrow indicates a transformant showing greater βGal activity under low CO2 conditions.



(e) Conclusions 

This study describes the construction of lacZ promoter 
probe vectors and their modification and use in the uni- 
cellular cyanobacterium Synechococcus R2 PCC7942. 

(1) The lacZ gene was shown to be expressed in this 
organism from both an endogenously replicating plasmid 
and also integrated into the chromosome. 

(2) Plasmids pLACPB1 and pLACPB2 are lacZ promoter
probes for use in Synechococcus R2 with a replication
origin functional in this organism, and which transform
Synechococcus R2 at high frequency. In addition to the
various presumptive CO2-regulated promoters described
here we have also identified promoters regulated by iron
and magnesium limitation (data not shown). These plasmids
allow relatively easy isolation of the promoter fragment,
which can then be used to clone the whole gene, which
would enable various functional studies. This approach
thus allows an alternative molecular approach to studying,
for example, inorganic carbon uptake in this organism.

SUMMARY

We report yeast/Escherichia coli shuttle vectors suitable for fusing yeast promoter and coding sequences to
the lacZ gene of E. coli. The vectors contain a region of multiple unique restriction sites including EcoRI, KpnI,
SmaI, BamHI, XbaI, SalI, PstI, SphI and HindIII. The region with the unique cloning sites has been introduced
in both orientations with respect to lacZ and occurs proximal to the eighth codon of the gene. All the restriction
sites have been phased to three different reading frames.

Two series of vectors have been constructed. The first series (YEp) has two origins of replication (ori), i.e.,
of the yeast 2µ circle and of the ColE1 plasmid of E. coli, and can therefore replicate autonomously in both
organisms. These shuttle vectors also have the ApR gene of E. coli and either the yeast LEU2 or URA3 genes
to allow for selection of both E. coli and yeast transformants. The second series of vectors (YIp) are identical
in all respects to the YEp vectors except that they lack the 2µ ori. The YIp vectors can be used to integrate
lacZ fusions into yeast chromosomal DNA. None of the vectors express β-galactosidase (βGal) in yeast or
E. coli in the absence of inserted yeast promoter sequences. The 5'-nontranslated sequences and parts of the
coding sequences of various yeast genes have been cloned into representative lacZ fusion vectors. In-frame gene
fusions can be detected by βGal activity when either yeast or E. coli clones are plated on media containing XGal
indicator. Quantitative determinations of promoter activity were made by colorimetric assay of βGal activity
in whole cells. Fusion of the yeast CYC1 gene to lacZ in one of the vectors allowed detection of regulated
expression of this gene when cells were grown under conditions of catabolite repression or derepression.



* To whom correspondence and requests for plasmids,
sequences and reprints should be addressed.

Abbreviations: Ap, ampicillin; βGal, β-galactosidase; bp, base
pair(s); kb, 1000 bp; MCR, multiple cloning region; LB, M63,
WO, YPD, see MATERIALS AND METHODS, section a; nt,
nucleotide(s); ONPG, o-nitrophenyl-β-D-galactoside; ori, origin
of DNA replication; PA, polyacrylamide; PolIk, Klenow (large)
fragment of E. coli DNA polymerase I; XGal, 5-bromo-4-chloro-indolyl-β-D-galactoside;
2µ, yeast 2µ circular plasmid DNA.

INTRODUCTION

Fusion of DNA sequences to the lacZ gene of
E. coli provides a convenient means of studying
prokaryotic and eukaryotic promoters and regulatory
elements (Bassford et al., 1978; Guarente,
1983; Rose and Botstein, 1983). The ability of
Saccharomyces cerevisiae to synthesize active βGal
has been extensively exploited to identify and delineate
regulatory sequences in the 5' non-coding
regions of yeast genes (for example see Rose et al.,
1981; Guarente and Ptashne, 1981; Guarente and
Mason, 1983; Guarente et al., 1984; Struhl, 1982;
Lucchini et al., 1984). The βGal fusions employed in
such studies involve ligation of the 5' upstream
regions and part of the coding region of a yeast gene
to the lacZ gene of E. coli lacking the promoter
sequences, the translational signals and the first
seven codons.

To simplify the construction of lacZ fusions in
yeast we have developed a set of vectors capable of
accepting DNA fragments compatible with all
restriction enzyme recognition sites present in the
multiple cloning region of the plasmid pUC18
(Yanisch-Perron et al., 1985). These yeast/E. coli
shuttle vectors contain the lacZ gene starting from
the eighth codon fused to the multiple cloning region
of pUC18 with either the HindIII or the EcoRI site
proximal to the E. coli gene. The restriction sites of
the multiple cloning region occur in all three reading
frames with respect to the lacZ coding sequence.
Two types of vectors have been constructed. The
first type, designated by the prefix YEp, contains
sequences allowing autonomous replication in E. coli
and in yeast. These vectors also contain the E. coli
β-lactamase gene to confer Ap resistance, and either
the yeast URA3 or LEU2 gene to permit prototrophic
selection of transformants. The second set of vectors,
designated by the prefix YIp, are identical to the
YEp vectors except that the yeast 2µ circle sequence
necessary for autonomous replication in yeast has
been deleted. These vectors can be used to integrate
lacZ fusions into yeast chromosomal DNA. The
vectors have been shown to express βGal in yeast in
the presence but not absence of DNA inserts with
appropriate transcriptional and translational signals.
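
Because the cloning sites are available in all three reading frames, choosing a vector for a given fragment reduces to modulo-3 arithmetic. The following is a minimal sketch of that reasoning, not the authors' procedure: the vector names and frame offsets are hypothetical placeholders (the real phasing is given in Table I, which is not reproduced here), and `fusion_in_frame` and `compatible_vectors` are illustrative names.

```python
# Minimal sketch: pick the vector whose cloning-site phase keeps lacZ in frame.
# An "offset" is the number of extra nt (mod 3) that the vector places between
# the insert junction and the first retained lacZ codon (codon 8).

# Hypothetical offsets for three vectors of one series; not taken from Table I.
HYPOTHETICAL_OFFSETS = {
    "frame-A vector": 0,
    "frame-B vector": 1,
    "frame-C vector": 2,
}


def fusion_in_frame(coding_nt_before_junction: int, vector_offset: int) -> bool:
    """True when the yeast coding nt plus the vector offset is a multiple of 3."""
    return (coding_nt_before_junction + vector_offset) % 3 == 0


def compatible_vectors(coding_nt_before_junction: int) -> list[str]:
    """Names of the vectors that would give an in-frame lacZ fusion."""
    return [
        name
        for name, offset in HYPOTHETICAL_OFFSETS.items()
        if fusion_in_frame(coding_nt_before_junction, offset)
    ]


# Example: 10 coding nt (counted from the A of the ATG) precede the junction.
print(compatible_vectors(10))
```

For any fragment exactly one of the three phases is compatible, which is why a complete three-frame set is sufficient.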



MATERIALS AND METHODS 

(a) Media, strains and transformations 

Non-selective medium for yeast (YPD) contained
1% yeast extract, 2% peptone and 2% glucose.
Selective medium for yeast (WO) contained 0.67%
yeast nitrogen base minus amino acids and 2%
glucose supplemented as required with tryptophan,
uracil, histidine, adenine and leucine at 25 µg/ml.
E. coli was grown in LB medium (Davis et al., 1980)
supplemented with 40 µg Ap/ml when required for
selection of plasmids. E medium (Davis et al., 1980)
supplemented as required was used for selection of
specific markers in E. coli. Solid media contained
2% or 1.5% agar for growth of yeast and E. coli,
respectively. S. cerevisiae strain W303-1B (α
leu2-3,112 his3-11,15 ade2-1 ura3-1 trp1-1 can1-100
[cir+]) obtained from R. Rothstein, College of
Physicians and Surgeons, Columbia University,
New York, NY, was transformed with autonomously
replicating plasmids by the method of Beggs
(1978). Transformants were selected on minimal
glucose media lacking either uracil or leucine but
supplemented for the other auxotrophic requirements
of W303-1B. E. coli strain RR1 (proA, leuB6,
lacY, galK2, xyl-5, mtl-1, ara-14, rpsL20, supE44,
hsdS, λ−) was used for maintenance of plasmids and
for selection of plasmids containing the yeast LEU2
gene. E. coli strain MC1066 (F−, ΔlacX74, hsdR,
rpsL, galU, galK, trpC9830, leuB6, pyrF::Tn5) was
used for selection of plasmids containing the yeast
URA3 gene. The βGal-deficient E. coli strain
MC1009 (araD139, ΔlacX74, Δara-leu7697, galU,
galK, strA, recA56, srl::Tn10, relA, spoT) was used
to test expression of the plasmid copy of lacZ in
E. coli by plating on LB medium supplemented with
50 µg XGal/ml. All bacterial transformations were
by the CaCl2 procedure (Cohen et al., 1972). βGal
activity was tested in yeast by plating on M63 salts
medium supplemented with 40 µg XGal/ml
(Guarente, 1983).

(b) Miscellaneous procedures 

Standard techniques were used for preparation of
recombinant plasmids from E. coli, restriction
enzyme digestions, agarose gel electrophoresis, isolation
of restriction fragments from agarose gels,
ligation of restriction fragments and screening of
transforming DNAs (Maniatis et al., 1982). Fragments
with protruding 5' ends were converted to
blunt-ended fragments using the PolIk (Maniatis
et al., 1982). Controlled exonucleolytic digestion of
double-stranded DNA was accomplished by treatment
of approx. 1 µg DNA with 50 units of S1
nuclease in 50 µl of 30 mM NaCl, 1 mM ZnCl2,
35 mM sodium acetate, pH 4.75, for 5 min at 37°C.
The reaction was stopped by adding 100 µl of
100 mM Tris·HCl, pH 7.5, 10 mM EDTA, followed
by phenol extraction. This S1 nuclease treatment
resulted in the loss of 7-14 nt from each end of the
molecules. DNA sequences were determined by the
method of Maxam and Gilbert (1977). Quantitative
determination of βGal activity in yeast cells was
performed by measuring hydrolysis of ONPG as
described (Guarente, 1983).



RESULTS AND DISCUSSION 

(a) Construction of lacZ fusion vectors for expression
of βGal in yeast: YEp353, YEp354 and YEp355

The βGal fusion vector pMC1403 (Casadaban
et al., 1980) was modified by Minton (1984) to allow
fusion to the lacZ structural gene in three reading
frames. Three plasmids designated pNM480,
pNM481, and pNM482, all contain the multiple
cloning region of the plasmid pUC8 (Vieira and
Messing, 1982) upstream from the lacZ gene, with a
phase correction between the HindIII site and the
eighth codon of lacZ (Minton, 1984). The availability
of these plasmids suggested a simple means for
introducing the three different pUC8/lacZ sequences
into a yeast shuttle vector. For this purpose we chose
the episomal plasmids YEp351 and YEp352 (Hill
et al., 1986), both of which contain the entire pUC18
sequence, the yeast 2µ origin of replication, and the
wild-type LEU2 or URA3 genes, respectively.
Initially the 3.15-kb EcoRI-DraI fragment of each
pNM vector was ligated separately to YEp351 or
YEp352 from which 215 bp between the EcoRI and
NarI sites had been removed (Fig. 1). The resultant
plasmids YEp353A, YEp354A, YEp355A and
YEp363A were capable of replicating in E. coli and
in yeast and of complementing the leu2 or ura3
mutations of an appropriately marked yeast strain.
Some of the plasmids, however, expressed βGal
activity when yeast or E. coli transformants were
plated in the presence of XGal. The synthesis of
βGal is probably due to the presence of the lac
promoter and an ATG start codon upstream from
the multiple cloning regions of YEp351 and YEp352.

The lac promoter region was removed from
YEp352 as shown in Fig. 1. YEp352 was digested to
completion with PvuII to eliminate the entire lacZ'
region of pUC18 as well as the operator/promoter
and part of the lacI gene. The digestion mixture was
then briefly treated with S1 nuclease to remove an
ATG codon located immediately 5' of the PvuII site
within the lacI gene. The 8.0-kb vector band was
purified and was ligated to a blunt-ended EcoRI
linker; the sequence of this linker was
5'-CCCGAATTCGGG-3'. Several different plasmids
containing the EcoRI site were partially sequenced
to determine the effects of digestion with S1
nuclease. The plasmid YEp352E was ascertained to
have lost 6 nt, including the ATG sequence on the
5' side of the lacI coding sequence.
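
A blunt-end linker of this kind is expected to carry the enzyme's recognition site and to be self-complementary, so that either strand anneals to itself. The sketch below checks both properties for a candidate sequence. It assumes the 12-mer reads 5'-CCCGAATTCGGG-3' (the scanned text is unclear at this position), and the helper names are illustrative.

```python
# Minimal sketch: sanity-check a synthetic EcoRI linker sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
ECORI_SITE = "GAATTC"


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_self_complementary(seq: str) -> bool:
    """True when the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)


def carries_site(seq: str, site: str = ECORI_SITE) -> bool:
    """True when the recognition site occurs in the sequence."""
    return site in seq


linker = "CCCGAATTCGGG"  # assumed reading of the linker sequence
print(carries_site(linker), is_self_complementary(linker))
```

A sequence that fails either check would not behave as a self-annealing EcoRI linker, which makes the test a quick guard against transcription errors.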

The pUC8/lacZ sequences from the previous set
of vectors YEp353A, YEp354A and YEp355A were
transferred into YEp352E (Fig. 1). The 3.9-kb
EcoRI-NcoI fragments of YEp353A, YEp354A and
YEp355A containing the multiple cloning region, the
lacZ gene and part of URA3 were purified from each
vector and ligated to the large EcoRI-NcoI fragment
of YEp352E. Following transformation of E. coli
RR1 with the ligation mixtures, Ap-resistant clones
were screened for plasmids having the pUC8/lacZ
sequences and the reconstituted URA3 gene. These
plasmids, designated YEp353, YEp354 and YEp355,
did not express βGal activity when transformed into
either yeast or E. coli (see section e below). The
disposition of the restriction sites in the pUC8
multiple cloning region with respect to the lacZ
reading frame was verified by nt sequence analysis
(Table I). The complete nt sequences of the YEp
vectors containing URA3 were compiled from the
known sequences of YEp352 and the pNM vectors
(Table II). All restriction sites of the multiple cloning
region occur once in these constructs, with the exception
of SacI, which is also present in the coding
sequence of lacZ.
[Fig. 1: plasmid construction diagram. Recoverable labels: lacZ; URA3 (LEU2); 3.2-kb EcoRI-DraI; 4.9-kb EcoRI-NarI; (1) digest with PvuII, (2) treat with S1 nuclease, (3) ligate to EcoRI linker; (D-N); 3.9-kb EcoRI-NcoI; 4.1-kb EcoRI-NcoI.]

Fig. 1. Construction of YEp353 and YEp363. Single line, pUC8 or pUC18 sequence. Open box, lacZ or lacY sequence; solid box, URA3
or LEU2 sequence; cross-hatched box, 2µ circle sequence. The figure is drawn to scale for the URA3-containing vectors except for the
0.3-kb region spanning the PvuII sites, which has been expanded for detail. The indicated vector sizes apply to scaled vectors only. The
arrangement of genes in the corresponding LEU2-containing vectors YEp351, YEp363A and YEp363 (indicated by numbers in
parentheses) is also represented by these figures, although in this case scale is no longer maintained. Restriction sites are indicated for
AatII (A), DraI (D), EcoRI (E), HindIII (H), NarI (N), NcoI (C), and PvuII (P). The NcoI site is present only in the vectors containing
URA3. (D-N) indicates the ligated junction of free ends created by cleavage with NarI and DraI, where neither restriction site was
recreated. Arrowheads indicate the lac promoter. YEp363 was formed by ligating the 2.7-kb AatII fragment of YEp353 to the 5.8-kb
AatII fragment of YEp363A. YEp vectors with MCRs in the other two reading frames were constructed by repeating the manipulations
diagrammed here using pNM481 and pNM482 as the source of the lacZ gene.
TABLE II

Restriction enzyme recognition sites in YEp356 a

Enzyme     No. of sites   Position of sites

AatII       2    672 5979
AccI        3    34 3278 3783
AcyI        5    672 2200 2372 5979 6361
AflIII      6    1355 2135 2560 2812 3928 7790
AhaIII      3    6322 7014 7033
AluI       29    8 53 144 255 435 1990 2290 2701 3064 3258 4051 4101 4330 4639
                 4764 5048 5450 5545 5620 5634 5850 5869 6548 6611 6711 7232 7489 7625
                 7851
ApaI        1    3733
AsuI       14    164 1594 2480 2742 2880 3733 3734 3955 4952 5922 6538 6760 6777 6856
AsuII       1    3843
AvaI        4    17 1391 2834 4855
AvaII       5    1594 3955 4952 6538 6760
AvaIII      2    3284 4849
BamHI       1    22
BclI        1    1399
BglI        3    199 2320 6778
BinI        9    22 1462 1754 2795 6255 6576 7040 7138 7224
£>JC i I    1    1551
BstXI       2    2266 2883
CauII      15    17 18 1487 1577 1936 2113 2173 3120 3505 4512 5829 5864 6365 6716
                 7412
CfrI        5    524 1522 3319 6509 7951
ClaI        1    876
DdeI       16    277 558 2217 3066 3378 3615 4211 4636 4761 4885 5740 5975 6401 6941
                 7107 7516
EcoRI       1    1
EcoRII     14    96 223 527 1620 2188 2455 2489 3034 3730 4368 4462 7630 7643 7764
EcoRV       2    1164 3921
GdiII       5    524 1522 3319 6509 7951
HaeI        8    449 1427 2687 3126 3673 7314 7766 7777
HaeII      20    214 508 945 1434 1886 2009 2109 2184 3069 4427 4611 4674 4736 4799
                 4861 5500 5519 5597 7546 7916
HaeIII     25    167 281 450 525 1257 1428 1523 2481 2492 2688 2743 2880 3127 3320
                 3674 3734 5923 6510 6777 6857 7315 7749 7767 7778 7952
HgaI       15    *7fiftQ f O07 /ITT 1446 1628 2201 2225 2238 2372 2861 3478 5490 5803 6361
                 7111
HgiAI       9    7 1989 2398 2513 4219 5733 6230 6315 7476
HgiCI       6    13 233 245 849 1301 6949
HgiEII      3    236 3223 7203
HgiJII      5    7 1989 3038 3640 3733
HincII      7    34 477 1101 2929 3467 3603 4561
HindIII     1    52
HinfI      20    32 392 959 1091 1310 1422 3030 4259 4632 4694 4757 4819 5014 5147
                 5240 6903 7420 7816 7891 7956
HinfIII    12    3 209 459 958 1092 1365 1649 1961 2867 3753 4258 5433
HpaI        3    477 1101 4561
HpaII      30    18 231 249 577 1341 1488 1578 1936 2103 2113 2173 2597 2609 2676
                 3073 3120 3505 4513 5463 5830 5864 6365 6607 6717 6784 6818 7222 7412
                 7438 7585
HphI       15    596 828 867 1388 1809 2037 2392 3367 5890 5899 6183 6198 6424 6820
                 7047
KpnI        1    13
(TABLE II, continued)

Enzyme     No. of sites   Position of sites

MaeI       11    29 3679 4069 4425 5041 5292 5543 5547 6709 7044 7297
MaeIII     30    82 102 322 348 406 520 616 670 793 807 906 1108 1359 1669
                 1953 2223 3360 3528 3691 4337 5455 5854 6242 6430 6583 6641 6972 7255
                 7371 7434
MboI       30    23 60 174 270 636 900 1326 1380 1400 1463 1476 1640 1755 1833
                 2385 2796 3430 6220 6256 6273 6531 6577 6595 6936 7041 7053 7131 7139
                 7150 7225
MboII      30    159 272 633 1563 2830 2976 3327 3408 3540 3593 3697 3841 3846 4119
                 4178 4235 4438 4691 4967 4995 5150 5180 5493 5692 6108 6217 6295 7050
                 7141 7912
MluI        3    1355 2135 2560
MnlI       27    26 162 279 301 590 762 1041 1096 1122 2046 2216 3371 3406 3676
                 3899 5316 5883 5925 6536 6742 6872 6953 7353 7620 7677 7903 7936
MstI        4    193 5110 5623 6677
MstII       1    276
NcoI        1    3902
NdeI        2    3009 4190
NlaIII     27    47 717 1128 1139 1298 2161 2813 2986 3130 3283 3549 3649 3903 3929
                 4038 4110 4172 5687 5874 5958 6063 6456 6492 6570 6580 7071 7791
NspBII     14    143 845 1227 1635 2025 2376 2403 2700 2789 3063 5798 6264 7205 7450
NspCI       6    46 2812 3282 3928 5873 7790
PssI        2    3954 5921
PstI        1    40
PvuI        5    173 899 1379 1832 6530
PvuII       3    143 2700 3063
RsaI       13    14 756 1235 1547 2133 2825 3167 3610 3798 3861 3994 5744 6420
SacI        2    7 1989
SalI        1    34
ScaI        2    3797 6419
SduI       14    7 1989 2175 2398 2513 3038 3640 3733 3895 4219 5733 6230 6315 7476
SmaI        1    17
SnaI        3    2816 3278 3783
SnaBI       1    5203
SphI        1    46
SspI        3    1281 5439 6095
StuI        1    3673
TaqI       16    5 35 877 925 1094 1474 1948 2287 2461 2998 3844 4106 5238 5320
                 6248 7692
mm          7    4023 4624 4749 5417 7177 7183 7216
XbaI        1    28
XhoII       9    22 2795 3429 6255 6272 7040 7052 7138 7149
XmnI        2    5172 6298

a The vector contains 7966 bp numbered on the coding strand of lacZ, with nt 1 defined as the first nt of the MCR. Differences between
YEp356 and other YEp vectors containing URA3 are shown in Table I. The MCR includes nt 1-62, the lacZ sequence includes
nt 63-3175, the URA3 sequence includes nt 3231-4333, the 2µ sequence includes nt 4334-5728, and the pUC sequence includes nt
5729-7966 and 3176-3230.
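
The coordinate ranges in this footnote make it possible to assign any tabulated site to a vector segment. The following is a minimal sketch under that reading; the segment table is transcribed from the footnote, while the function name `segment_at` and the label "2-micron" are illustrative choices.

```python
# Minimal sketch: map a YEp356 coordinate to the segment named in footnote a.
PLASMID_LENGTH = 7966

# (label, first nt, last nt), inclusive, as listed in the footnote.
SEGMENTS = [
    ("MCR", 1, 62),
    ("lacZ", 63, 3175),
    ("pUC", 3176, 3230),
    ("URA3", 3231, 4333),
    ("2-micron", 4334, 5728),
    ("pUC", 5729, 7966),
]


def segment_at(position: int) -> str:
    """Return the segment label containing a 1-based coordinate."""
    if not 1 <= position <= PLASMID_LENGTH:
        raise ValueError(f"position {position} is outside 1-{PLASMID_LENGTH}")
    for label, first, last in SEGMENTS:
        if first <= position <= last:
            return label
    raise ValueError(f"position {position} is not covered by any segment")


# Example: the two SacI positions and the single NcoI position from Table II.
for enzyme, pos in [("SacI", 7), ("SacI", 1989), ("NcoI", 3902)]:
    print(enzyme, pos, segment_at(pos))
```

Under these ranges the two SacI positions fall in the MCR and in lacZ, and the NcoI position falls in URA3, in line with the statements made about those sites in the text.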
(b) Construction of YEp356, YEp357, YEp358,
YEp356R, YEp357R and YEp358R

Additional unique restriction sites were introduced
into the lacZ fusion vectors upstream from the
βGal gene by replacement of the multiple cloning
region of pUC8 with that of pUC18. The multiple
cloning region of pUC18 was isolated as a 56-bp
EcoRI-HindIII fragment and ligated separately to
YEp353, YEp354 and YEp355 digested with EcoRI
and HindIII. The resultant recombinant plasmids
YEp356, YEp357, and YEp358 contain unique
SphI, KpnI, and XbaI recognition sites upstream
from the lacZ gene in addition to those present in the
pUC8 multiple cloning region (Table I). The three
vectors also contain a SacI recognition site available
for cloning in-frame gene fusions, although this site
is also present once in the lacZ sequence. The
phasing of the reading frames of the unique cloning
sites of YEp356-YEp358 relative to lacZ was
verified by nt sequence analysis (Table I).

To facilitate cloning of yeast genomic fragments
defined by different upstream restriction sites, the
multiple cloning regions of YEp356, YEp357 and
YEp358 were inverted with respect to lacZ. A blunt-ended
form of the pUC18 multiple cloning region
was constructed from the 56-bp EcoRI-HindIII fragment
used earlier. The fragment was first methylated
using HpaII methylase to protect the internal
5'-CCCGGG-3' sequence from digestion with
SmaI. The methylated fragment was ligated to two
adaptor sequences, the EcoRI-SmaI adaptor
5'-GAATTCCCGGG-3' and the HindIII-SmaI
adaptor 5'-AAGCTTCCCGGGA-3'. These two
adaptor sequences recreate the EcoRI and HindIII
sites. The high-Mr ligation products were digested
with SmaI and the unit-length, blunt-ended fragment
with the multiple cloning region was purified on a 6%
PA gel. This fragment was ligated separately to
YEp356, YEp357, and YEp358, which had been
digested with EcoRI + HindIII and made
blunt-ended by treatment with PolIk. The ligation
mixture was used to transform E. coli strain RR1,
and individual clones were screened by restriction
mapping for plasmids in which the orientation of the
multiple cloning region was opposite that of the
parent vectors. Three plasmids designated
YEp356R, YEp357R, and YEp358R were confirmed
by DNA sequence analysis to have the EcoRI site
proximal to the lacZ gene with the unique cloning
sites in each of the three reading frames (Table I).

(c) Construction of lacZ fusion vectors containing 
LEU2 as a selectable marker 

A second set of lacZ fusion vectors containing the
yeast LEU2 gene as a selectable marker was constructed
by transferring segments of each URA3-containing
vector to YEp363A (Fig. 1). YEp353
through YEp358R were used to prepare a 2.7-kb
AatII fragment containing most of the pUC18
sequence, the multiple cloning region, and the
5' region of lacZ. These fragments were ligated
separately to the 5.8-kb AatII fragment of YEp363A
containing the 3' region of lacZ, the yeast LEU2
gene, and the remainder of pUC18 (Fig. 1). The
ligation mixture was used to transform E. coli RR1
and Ap-resistant colonies were scored for leucine
prototrophy by complementation of the leuB mutation
of E. coli. Plasmid DNA extracted from the
Leu+ clones was analyzed by restriction mapping to
confirm reconstitution of the lacZ gene. The resultant
plasmids YEp363-YEp368R contain MCRs
with the same disposition of reading frames as the
corresponding URA3 vectors YEp353-YEp358R
(Table I). The complete nt sequences of these vectors,
compiled from the known sequences of YEp351
and the pNM vectors, show that the EcoRI, SacI, and
KpnI sites are present twice while the remainder of
the sites in the MCR are unique (Table III).
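
Fragment sizes such as the 2.7-kb AatII fragment can be estimated from tabulated site positions on a circular map. The sketch below uses the AatII positions (672 and 5979) and the 7966-bp length given for YEp356 in Table II. It treats the listed recognition-site coordinates as cut points, which is an approximation, and `circular_fragments` is an illustrative name.

```python
# Minimal sketch: fragment sizes from site positions on a circular plasmid.
def circular_fragments(sites: list[int], plasmid_length: int) -> list[int]:
    """Sizes of the fragments between consecutive sites on a circle."""
    ordered = sorted(sites)
    if not ordered:
        return [plasmid_length]  # uncut circle
    fragments = []
    for i, start in enumerate(ordered):
        end = ordered[(i + 1) % len(ordered)]
        size = (end - start) % plasmid_length
        # A single site gives size 0 here; that case is one linear molecule.
        fragments.append(size if size else plasmid_length)
    return fragments


# AatII positions in YEp356 as listed in Table II.
print(sorted(circular_fragments([672, 5979], 7966)))
```

With these inputs the smaller fragment comes out at about 2.7 kb and spans the coordinate origin, so it includes the start of the map (the MCR and the 5' end of lacZ).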

(d) Construction of integrative lacZ fusion vectors 

Each of the episomal lacZ fusion vectors
described above was converted to an integrative vector
by removal of yeast 2µ circle sequences required
for autonomous replication in yeast (Fig. 2). In the
case of the vectors with the URA3 gene, the 5.9-kb
region from the AatII site of pUC18 to the NcoI site
in URA3 was ligated to the 1.2-kb AatII-NcoI fragment
of the integrative vector YIp352 (Hill et al.,
1986). This fragment of YIp352 supplies the
sequences necessary for reconstitution of pUC18
and URA3, but does not contain the 2µ sequence
essential for autonomous replication in yeast. A
similar approach was used to construct integrative
forms of the LEU2 vectors. The region of
YEp363-YEp368R from the KpnI site of the LEU2
TABLE III

Restriction enzyme recognition sites in YEp366 a

Enzyme     No. of sites   Position of sites

AatII       2    672 6437
AccI        2    34 3602
AcyI        5    672 2200 2372 6437 6819
AflII       1    4479
AflIII      6    1355 2135 2560 2812 4057 8248
AhaIII      3    6780 7472 7491
AluI       25    8 53 144 255 435 1990 2290 2701 3064 3553 4220 4712 5279 5404
                 5688 6090 6308 6327 7006 7069 7169 7690 7947 8083 8309
AsuI       15    164 1594 2480 2742 2880 4232 4264 4560 4904 5592 6380 6996 7218 7265
                 7314
AsuII       2    4673 5161
AvaI        4    17 1391 2834 5495
AvaII       7    1594 4232 4560 4904 5592 6996 7218
AvaIII      1    5489
BamHI       1    22
BclI        1    1399
BglI        3    199 2320 7236
BinI       10    22 1462 1754 2795 4108 6713 7034 7498 7596 7682
BseYl       1    1551
BstEII      1    4759
BstXI       3    2266 2883 4080
CauII      14    17 18 1487 1577 1936 2113 2173 3120 3921 6287 6322 6823 7174 7870
CfrI        5    524 1522 4276 6967 8409
ClaI        2    876 4643
DdeI       16    277 558 2217 3066 4784 5153 5186 5276 5401 5525 6198 6433 6859 7399
                 7565 7974
EcoRI       2    1 4157
EcoRII     13    96 223 527 1620 2188 2455 2489 3034 4017 4764 8088 8101 8222
EcoRV       2    1164 4046
GdiII       5    524 1522 4276 6967 8409
HaeI       10    449 1427 2687 3126 3826 3982 4207 7772 8224 8235
HaeII      19    214 508 945 1434 1886 2009 2109 2184 3069 5106 5251 5314 5376 5439
                 5501 6140 6159 8004 8374
HaeIII     29    164 281 450 525 1257 1428 1523 2481 2492 2688 2743 2880 3127 3249
                 3827 3983 4208 4265 4277 4993 6381 6968 7235 7315 7773 8207 8225 8236
                 8410
HgaI        8    7 1989 2398 2513 6191 6688 6773 7934
HgiCI       8    13 233 245 849 1301 3457 4547 7407
HgiEII      4    236 3734 3953 7661
HgiJII      3    7 1989 3038
HincII      7    34 477 1101 2929 3899 4921 5201
HindIII     1    52
HinfI      26    32 392 959 1091 1310 1422 3030 3413 3696 4122 4289 4455 5093 5159
                 5272 5334 5397 5459 5654 5787 5880 7361 7878 8274 8349 8414
HinfIII    14    3 209 459 958 1092 1365 1649 1961 2867 3414 4159 5086 5160 6073
HpaI        3    477 1101 5201
HpaII      33    18 231 249 577 1341 1488 1578 1936 2103 2113 2173 2597 2609 2676
                 3073 3120 3234 3921 4545 4986 4958 4991 6103 6288 6322 6823 7065 7175
                 7276 7680 7870 7896 8043
HphI       22    596 828 867 1388 1809 2037 2392 3286 3741 3777 4050 4067 4341 4761
                 4999 6348 6357 6641 6656 6882 7278 7505
KpnI        2    13 4547
MaeI        8    29 4105 5681 5932 6183 7167 7502 7755
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(TABLE III, continued) 



Enzyme  No. of sites  Position of sites

MaeIII  28  82 102 322 348 406 520 616 670 793 [illegible] 1953 2223 4469 4760 6095 6312 6700 [illegible] 7041 [illegible]
MboI  32  23 60 174 270 636 900 1326 [illegible] 1400 [illegible] 2385 2796 3231 4109 4778 6678 6714 6731 6989 7035 7053 [illegible] 7511 7589 7597 7608 7683
MluI  3  1355 2135 2560
MstI  3  193 5750 7135
MstII  1  276
NdeI  1  3009
NlaIII  26  47 717 1128 1139 1298 2161 2813 2986 3130 3425 [illegible] 4112 4255 4268 4473 4955 6332 6416 6521 6914 6950 7028 7038 7529 8249
NspBII  14  143 845 1227 1635 2025 2376 2430 2700 2789 3063 [illegible]
NspCI  5  46 2812 4057 6331 8248
PssI  2  4903 6379
PstI  1  40
PvuI  5  173 899 [illegible]
PvuII  3  143 2700 3063
RsaI  16  14 756 1235 1547 2133 2825 3167 3237 3675 3756 3947 4410 4490 4548 6202 6878
SacI  2  7 1989
SalI  1  34
ScaI  1  6877
SduI  11  7 1989 2175 2398 2513 3038 3458 6191 6688 6773 7934
SmaI  1  17
SnaI  1  2816
SnaBI  1  5843
SphI  1  46
SspI  5  1281 3475 4915 6079 6553
TaqI  19  5 35 877 925 1094 1474 1948 2287 2461 2998 4644 4674 4680 5085 5162 5878 5960 6706 8150
[enzyme name illegible]  9  4054 4792 4943 5264 5389 6057 7635 7641 7674
XbaI  1  28
XhoII  9  22 2795 3230 6713 6730 7498 7510 7596 7607
XmnI  4  3613 4157 5812 6756

a The vector contains 8424 bp numbered as described in Table II. Differences between YEp366 and other YEp vectors containing LEU2
are shown in Table I. The MCR includes nt 1-62, the lacZ sequence includes nt 63-3175, the LEU2 sequence includes nt 3231-5202,
the 2μ sequence includes nt 5203-6186, and the pUC sequence includes nt 6187-8424 and 3176-3230.
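
The site positions in Table III and the region boundaries in the footnote can be combined to predict digest fragments. The following is a minimal sketch, assuming the tabulated positions can be treated as cut coordinates on the circular 8424-bp vector (the table does not say whether a position marks the cut or the start of the recognition sequence, so fragment lengths may be off by a few bases). The function and variable names are illustrative only.

```python
VECTOR_LENGTH = 8424  # bp, from the table footnote

# Region boundaries (inclusive, 1-based) as listed in the table footnote.
REGIONS = [
    (1, 62, "MCR"),
    (63, 3175, "lacZ"),
    (3176, 3230, "pUC"),
    (3231, 5202, "LEU2"),
    (5203, 6186, "2-micron"),
    (6187, 8424, "pUC"),
]


def circular_fragments(sites, plasmid_length=VECTOR_LENGTH):
    """Return fragment lengths produced by cutting a circular molecule at `sites`."""
    if not sites:
        return [plasmid_length]  # uncut circle
    ordered = sorted(set(sites))
    # Fragments between consecutive sites.
    fragments = [b - a for a, b in zip(ordered, ordered[1:])]
    # The last fragment wraps around the origin of the numbering.
    fragments.append(plasmid_length - ordered[-1] + ordered[0])
    return fragments


def region_of(position):
    """Name the vector region that contains a 1-based position."""
    for first, last, name in REGIONS:
        if first <= position <= last:
            return name
    raise ValueError(f"position {position} is outside 1-{VECTOR_LENGTH}")


# EcoRI has two listed sites (1 and 4157), so a complete digest gives two fragments.
ecori_fragments = circular_fragments([1, 4157])
```

A single-site enzyme such as HindIII (position 52) simply linearizes the vector, so the function returns one fragment of the full length.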



gene to the ScaI site of the lacZ gene was replaced
with the corresponding KpnI-ScaI region of the
integrative vector YIp351 (Hill et al., 1986) to yield
a set of vectors with a deletion in the yeast 2μ
sequence. The integrative vectors containing URA3
as a selectable marker are designated YIp353-
YIp358R and those containing LEU2 are designated
YIp363-YIp368R (Table I).



(e) Properties of the lacZ fusion vectors 

Each of the autonomously replicating plasmids
constructed in this study was used to transform
yeast strain W303-1B to leucine or uracil independence.
All vectors transformed yeast at the high
frequency seen for other episomal plasmids containing
the 2μ circle origin of replication (Broach, 1983).
Greater than 80% of the segregants tested from
several different transformants retained the appropriate
prototrophic marker after growth for
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EH 




[Fig. 2 diagram; fragment labels: 5.9-kb AatII-NcoI, 1.2-kb AatII-NcoI]

Fig. 2. Construction of the integrative vector YIp353. YIp352
was derived from YEp352 by removal of 2μ sequence necessary
for autonomous replication in yeast (Hill et al., 1986). Symbols
are as in Fig. 1. Additional restriction sites are indicated for HpaI
(L) and SspI (S). (S-L) indicates the ligation junction of free ends
created by cleavage with SspI and HpaI, where neither restriction
site was recreated. Not all restriction sites shown in this
drawing are unique in the vectors. Symbol ori indicates the
position of the yeast 2μ origin of replication.



30-40 generations in non-selective medium, indicating
that the episomal plasmids are retained at a
high copy number.
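
The retention figure above can be turned into a rough per-generation loss rate. This is a minimal sketch, assuming a constant probability of plasmid loss per generation and no growth difference between plasmid-bearing and plasmid-free cells; neither assumption is stated in the text, and the function name is illustrative. Because the text reports "greater than 80%", the value computed from 0.80 is an upper bound on the loss rate under this model.

```python
def per_generation_loss(fraction_retained, generations):
    """Loss rate per generation if a fixed fraction of cells keeps the plasmid each generation."""
    if not 0 < fraction_retained <= 1:
        raise ValueError("fraction_retained must be in (0, 1]")
    if generations <= 0:
        raise ValueError("generations must be positive")
    # fraction_retained = (1 - loss) ** generations, solved for loss.
    return 1.0 - fraction_retained ** (1.0 / generations)


# Bounds for the reported range of 30-40 generations at 80% retention.
loss_30 = per_generation_loss(0.80, 30)
loss_40 = per_generation_loss(0.80, 40)
```

Under these assumptions the loss rate comes out below one percent per generation for both ends of the reported range.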

The ability of the episomal plasmids to express
βGal activity was tested in the yeast transformants
and in the ΔlacZ E. coli strain MC1009. In the
absence of yeast promoter sequences ligated into the
multiple cloning region, none of the plasmids induced
the characteristic blue color indicative of βGal
activity when yeast or E. coli transformants were
grown on plates containing XGal. Various segments
of yeast DNA have been cloned into the appropriate
vectors to create in-frame gene fusions to lacZ.
Among the yeast genes tested are MRP2 coding for
a mitochondrial ribosomal protein (Myers and
Tzagoloff, 1986), MSD coding for the mitochondrial
aspartyl tRNA synthetase (A. Gampel and
A. Tzagoloff, unpublished results), CPA2 coding for
the large subunit of carbamyl phosphate synthetase
(Lusty et al., 1983) and CYC1 coding for
apo-iso-1-cytochrome c (Montgomery et al., 1978). Fusion of
the upstream regions and part of the coding sequence
of these genes to the lacZ gene of different episomal
plasmids described here allowed detection of βGal
activity when either E. coli or yeast transformants
were plated in the presence of XGal. Quantitative
determinations of βGal activity were made by
measuring hydrolysis of ONPG by yeast cells grown
in liquid cultures. The βGal activity expressed from
the plasmids differed depending on the particular
cloned yeast promoter. The βGal activity present in
a particular transformant also differed depending on
growth conditions, indicating that promoter activity
could be assayed in the YEp vectors by measurement
of lacZ expression. For example, the βGal activity
derived from one episomal plasmid containing the
5'-nontranslated region and first two codons of
CYC1 fused to lacZ was five times greater in cells
grown under conditions of catabolite derepression
(ethanol) than when the same transformant was
grown under conditions of catabolite repression
(glucose). These results were comparable to those
obtained using an isogenic transformant containing
pLGΔ312, a βGal fusion construct used previously
for studies of regulation of CYC1 (Guarente and
Mason, 1983; Guarente et al., 1984).
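
The comparison between derepressed and repressed cultures can be sketched numerically. The text does not say how ONPG hydrolysis was normalized, so the example below assumes a common Miller-style normalization (absorbance at 420 nm divided by assay time, culture volume, and cell density) without the cell-debris correction. All readings are hypothetical values chosen for illustration, not data from the study.

```python
def miller_style_units(a420, minutes, culture_ml, od600):
    """Normalize ONPG hydrolysis (A420) by assay time, culture volume, and cell density."""
    if minutes <= 0 or culture_ml <= 0 or od600 <= 0:
        raise ValueError("time, volume, and OD600 must be positive")
    return 1000.0 * a420 / (minutes * culture_ml * od600)


def fold_induction(derepressed_units, repressed_units):
    """Ratio of activity under derepressing versus repressing growth conditions."""
    if repressed_units <= 0:
        raise ValueError("repressed activity must be positive")
    return derepressed_units / repressed_units


# Hypothetical readings for one transformant grown on ethanol and on glucose.
ethanol_units = miller_style_units(a420=0.50, minutes=10, culture_ml=0.1, od600=1.0)
glucose_units = miller_style_units(a420=0.10, minutes=10, culture_ml=0.1, od600=1.0)
ratio = fold_induction(ethanol_units, glucose_units)
```

Normalizing both cultures the same way before taking the ratio keeps differences in cell density or assay time from being mistaken for differences in promoter activity.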

The upstream region of the CYC1 gene was also
cloned into the integrative plasmid YIp356R. The
recombinant plasmid was linearized within the
URA3 gene by digestion with NcoI and used to
transform yeast strain W303-1B to uracil independence.
Transformants were isolated in which
greater than 99% of the segregants retained both the
URA3 gene and lacZ activity after growth for
30-40 generations in non-selective media, indicating
that the lacZ fusion had stably integrated into the
yeast genome.
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ENZYMATIC SYNTHESIS OF RNA 375 



Figure 11-5

A schematic diagram of the analysis of the interaction of RNA polymerase
(RNA-P) with DNA. Identification of (a) the protected region (left path of
arrows) and (b) of some of the bases in contact with the enzyme (right
path of arrows). For clarity, only one strand of DNA is shown, though the
experiment is performed with double-stranded DNA. To locate the binding
sequence the base sequence of the protected fragment is compared to
that of a larger segment of DNA. Contact points are identified by comparing
the methylation patterns obtained when RNA-P is either present
or absent.
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polymerase molecules; the number is greater when cells are growing 
rapidly. 



Site Selection: I. The Promoter 



The first step in transcription is binding of RNA polymerase to a DNA
molecule. Binding occurs at particular sites called promoters, which are
specific sequences of 20-200 bases at which several interactions occur.
(A promoter is also frequently defined as a region protected by RNA
polymerase from digestion by endonucleases.) The existence of promoters
was first demonstrated by the isolation of a particular class of
Lac⁻ mutations in E. coli. These mutations not only eliminate gene
activity but also are noncomplementable (because they are cis-acting)
and prevent synthesis of the RNA transcript of the lac gene. These
mutations are called promoter mutations.

Several events must occur at a promoter. RNA polymerase must
recognize a specific DNA sequence, attach in a proper configuration,
open the DNA to gain access to the bases to be copied, and then initiate
synthesis. These events are guided by the base sequence of the DNA, the
polymerase σ subunit (without which the promoter is not recognized)
and, for some promoters, by auxiliary proteins. The details of these
events are not yet known, but the process can be broken down into three
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TRANSCRIPTION 



Figure 11-6

Segments of the noncoding strand of protected regions from
various genes showing the common sequence of seven bases
(red) known as the Pribnow box. The start point for mRNA
synthesis is shown. The "conserved" T is underlined.




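[Figure 11-6 sequence alignment. Only the fragments between the Pribnow box and the mRNA start survive in this copy: A; GTGGA; ACCACA; AAATCG; TTGCA; TTACA; GTGGA; TTTCA; CTCCA; ACAGCCA; GATTCA; GCGCCCG; CGGTAG.]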



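[Figure 11-7 diagram: lac promoter, noncoding strand, with scale marks at -40, -30, -20, and 0 (mRNA start). Legible sequence:
GGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCG [Pribnow box, not legible in this copy] TGTGGAATTG
Arrows mark the positions of the mutations, including base deletions.]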



Figure 11-7

A region of the noncoding strand of the promoter for the lac gene showing six mutations (red
arrows) that affect promoter activity; Δ means a base deletion. The Pribnow box is shaded in red.
Many base changes are known; all are either in or near the Pribnow box or are clustered around
base -35 and thus define an important site (see page 377).



parts: (a) template binding at a polymerase recognition site, (b) movement
to an initiation site, and (c) establishment of what is termed an
open-promoter complex (shown schematically later in Figure 11-9). The
approach to elucidating these steps for many genes has been to isolate
the DNA segment (the promoter) that is protected by RNA polymerase
from DNase digestion, determine the base sequence in the segment, and
look for common features in the sequences (Figure 11-6). The specific
sites of contact are also determined by the dimethyl sulfate protection
method. This is important because one might expect that the specific
contact sites would be in the regions common to all promoters.

The RNA molecules synthesized in vitro from each of these promoter
regions must also be sequenced if one wishes to identify the
initiation sequence, which is the sequence of the first few bases that are
transcribed; this sequence is just the complement of the bases at the 5'
terminus of the RNA molecule. Additional information is obtained by
determining the sequence of bases in promoters having mutations that
either eliminate initiation in vivo or change the requirements for
initiation (Figure 11-7). The rationale is that if a base change affects
promoter activity, that base must be contained in the promoter. This
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technique has allowed researchers to identify the bases in the protected 
segment that are actually part of the promoter. So far, 46 promoters have 
been sequenced. 



Site Selection: II. The Pribnow Box 
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Figure 11-6 shows portions of several promoter sequences in E; coli and 
E. coli phages (each promoter sequence is recognized by E. coli RNA 
polymerase) and their important features. In a region from five to ten 
bases to the left of the first base copied into mRNA is the right end of a 
sequence called the Pribnow box. All sequences found in Pribnow 
boxes are considered to be variants of a basic sequence TATAATG. The 
underscored T, at base 6 in the Pribnow box, from six to nine bases to 
the left of the first base transcribed (the distance depending on the 
distance from the Pribnow box to the transcription start point), is 
present in all promoters sequenced to date. It is called the "conserved
T" and different sequences are usually compared by aligning conserved
T's vertically, as shown in the figure. In 35 of 46 known Pribnow boxes 
in E. coli, the first two bases are TA; the variants, TG, CA, GA, and TC, 
retain one of the two TA bases. The Pribnow box is thought to be the 
sequence that orients RNA polymerase, so that synthesis proceeds from 
left to right (as the sequence is drawn), and the region at which the 
double helix opens to form the open-promoter complex (see below). 
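
The comparison described above, scoring a candidate box against the basic sequence TATAATG and checking the conserved T at base 6, can be sketched as follows. This is an illustration under stated assumptions: the function names and the scoring threshold are hypothetical, and the example fragment uses the widely cited lac promoter sequence around its Pribnow box (TATGTTG), which is not legible in this copy of the figures.

```python
CONSENSUS = "TATAATG"      # basic Pribnow box sequence given in the text
CONSERVED_T_INDEX = 5      # base 6 of the box, counted from 1


def consensus_matches(heptamer):
    """Count positions at which a 7-base candidate agrees with the basic sequence."""
    if len(heptamer) != len(CONSENSUS):
        raise ValueError("candidate must be 7 bases long")
    return sum(a == b for a, b in zip(heptamer.upper(), CONSENSUS))


def has_conserved_t(heptamer):
    """True if base 6 of the candidate is the conserved T."""
    return heptamer.upper()[CONSERVED_T_INDEX] == "T"


def pribnow_candidates(sequence, min_matches=5):
    """List (offset, heptamer, score) for windows that keep the conserved T.

    min_matches is an arbitrary illustrative threshold, not a value from the text.
    """
    sequence = sequence.upper()
    hits = []
    for offset in range(len(sequence) - len(CONSENSUS) + 1):
        window = sequence[offset:offset + len(CONSENSUS)]
        score = consensus_matches(window)
        if has_conserved_t(window) and score >= min_matches:
            hits.append((offset, window, score))
    return hits


# Fragment of the lac promoter region (assumed sequence, see lead-in).
lac_fragment = "GGCTCGTATGTTGTGTGGAATTG"
candidates = pribnow_candidates(lac_fragment)
```

Requiring the conserved T before scoring mirrors the way the text aligns promoters on that base; a real search would also restrict candidates to the expected distance from the transcription start.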

Before enough sequences were known that the conserved T was
recognizable, the first base transcribed was chosen as a reference point
and numbered zero. The direction of transcription was called "downstream";
all "upstream" bases, which are not transcribed, were given
negative numbers starting from the zero reference. The Pribnow box is
enclosed between -13 and -4, depending on the particular promoter.
This numbering convention has become standard.
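
The numbering convention can be made concrete with a short sketch that converts between string indices and promoter coordinates, using the zero reference described in the text. The helper names are hypothetical, and the example assumes the lac fragment GGCTCGTATGTTGTGTGGAATTG with transcription starting at the first A of AATTG; both the sequence and the start index are supplied for illustration rather than taken from this copy.

```python
def promoter_coordinates(sequence, start_index):
    """Pair each base with its coordinate, with the first transcribed base at 0."""
    return [(i - start_index, base) for i, base in enumerate(sequence)]


def slice_by_coordinate(sequence, start_index, first, last):
    """Return the bases from coordinate `first` through `last`, inclusive."""
    if first > last:
        raise ValueError("first must not exceed last")
    lo = start_index + first
    hi = start_index + last
    if lo < 0 or hi >= len(sequence):
        raise ValueError("requested coordinates fall outside the sequence")
    return sequence[lo:hi + 1]


lac_fragment = "GGCTCGTATGTTGTGTGGAATTG"
START_INDEX = 18  # assumed position of the first transcribed base in this fragment

# With this start point the seven-base box occupies coordinates -12 to -6,
# inside the -13 to -4 window given in the text.
box = slice_by_coordinate(lac_fragment, START_INDEX, -12, -6)
```

Upstream bases receive negative coordinates and downstream bases positive ones, so the same helper can extract the -35 region when a longer fragment is supplied.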

There are several mutations in the Pribnow box, two of which are 
shown in Figure 11-7, that prevent initiation of transcription. These 
mutations clearly indicate the importance of this sequence. Other bases 
outside of the Pribnow box are important too, as indicated by the other 
mutations shown in the figure. 



Site Selection: III. The -35 Sequence



Examination of the complete sequence of the region protected by RNA
polymerase indicates that for many (but not all) promoters, there is a
second important region, to the left of the Pribnow box, whose sequences
in different promoters have common features (Figure 11-8).
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TT ATGCTTCCGGCTCG 
TCGCTTTGCTTCTGAC 
AAATACCACTGGCGGT 
TTTT ACCTCTGGCGGT 
CTTGTTTATTGCAGCT 
GCGGCGCGTCATTTGA 




-35 Sequence 



Pribnow box 



GTGG AATTG 
A C A G GGT A A 
GC A C AT C AG 
TTGC ATGT A 
TTAC AA ATA 
GCCCCGCTT 

mRNA 
start 



Figure 11-8

Base sequences in the noncoding strand of six different
RNA polymerase-protected regions showing the similarity
between the -35 sequences. In each case, mutations
that eliminate promoter activity have been found in the
-35 sequence. The vertical lines indicate the HindII
cuts mentioned in the text. The Pribnow boxes
rather than the mRNA start points are aligned.



This sequence, which is called the -35 sequence and typically contains
nine bases, is thought to be the initial site of binding of the enzyme.
Evidence for this notion comes from the following experiment. RNA
polymerase is removed from the protected fragment and the fragment is
purified. If fresh RNA polymerase is then added, binding will occur,
indicating that the binding site is on the fragment. However, if the
fragment is first treated with a restriction nuclease (Chapter 20) called
HindII, which makes a double-strand break at the sites indicated in the
figure by the lines, RNA polymerase can no longer bind; presumably,
the binding site is destroyed by the nuclease. Thus, RNA polymerase is
thought to bind first at the leftmost side of the protected region and then
to the Pribnow box. How it moves from one site to the next is not
known. A theory that the enzyme "slides" along the DNA was popular at
one time; it has not been ruled out but is considered to be unlikely.
Another possibility, which has some experimental support, is that the σ
subunit binds first to a recognition site at the left in a highly specific
interaction and then, owing to the great size of the enzyme, the
appropriate region of the polymerase can come in contact with the
Pribnow box region (Figure 11-9). Once bound to the Pribnow box, the
polymerase then dissociates from the leftmost recognition site.
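
The HindII experiment depends on locating the enzyme's recognition sites within the protected fragment. A minimal sketch of such a search is shown below, assuming the standard HindII recognition sequence GTYRAC (Y is C or T, R is A or G) with a blunt cut between Y and R. The test sequence is invented for illustration and is not one of the promoters in the figure.

```python
import re

# GTYRAC written as a regular expression: Y = C or T, R = A or G.
HINDII_SITE = re.compile(r"GT[CT][AG]AC")


def hindii_sites(sequence):
    """Return 0-based start offsets of HindII recognition sites on the given strand."""
    return [match.start() for match in HINDII_SITE.finditer(sequence.upper())]


def hindii_cut_points(sequence):
    """Return 0-based offsets of the first base after each blunt cut (GTY^RAC)."""
    return [start + 3 for start in hindii_sites(sequence)]


# Invented test sequence containing GTTAAC and GTCGAC.
example = "AAGTTAACGGGTCGACTT"
site_offsets = hindii_sites(example)
cut_offsets = hindii_cut_points(example)
```

Because the recognition sequence is its own reverse complement, scanning one strand is enough to find every site in a double-stranded fragment.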

The open-promoter complex is a highly stable complex and is the
active intermediate in chain initiation. In this complex a local unwinding
("melting") of the DNA helix occurs starting about ten base pairs
from the left end of the Pribnow box and extending to the position of the
first transcribed base. This melting is necessary for pairing of the
incoming ribonucleotides. The base composition of the Pribnow box
sequence (A+T-rich) renders the DNA strand susceptible to denaturation.
Presumably RNA polymerase itself induces this conformational
change.
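
The A+T-richness mentioned above is easy to quantify. The sketch below simply computes the fraction of A and T bases in a sequence; the function name is illustrative, and the comparison sequence is invented. It says nothing about melting temperature, which depends on more than base composition.

```python
def at_fraction(sequence):
    """Fraction of bases in the sequence that are A or T."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    sequence = sequence.upper()
    return sum(base in "AT" for base in sequence) / len(sequence)


pribnow_at = at_fraction("TATAATG")   # basic Pribnow box sequence from the text
gc_rich_at = at_fraction("GCGCCCG")   # invented G+C-rich comparison
```

Six of the seven bases in the basic Pribnow box sequence are A or T, which is the property the text links to easier strand separation.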

The promoters discussed in this section are classified as high-level
or strong promoters. There are also weak promoters in which recognition
by RNA polymerase is poor. The number of RNA molecules
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(a) Template binding 



Figure 11-9

A proposed scheme for the binding of RNA polymerase to a promoter to
form an open-promoter complex. Regions of the DNA molecule important
for binding are shown in red. The shape of RNA polymerase is idealized
for schematic purposes. The enzyme covers the region from bases -45 to
+15, and the unpaired region in (c) extends from (roughly) -12 to +2. The
enzyme is shown in contact with both strands because the strands are
actually wrapped around one another in a helical array; however, true
binding occurs only to bases in the coding strand. "P.B." indicates the
Pribnow box.



[Figure 11-9 diagram labels: RNA polymerase; DNA]

(b) Dissociation of σ subunit from -35 sequence;
movement to Pribnow box.

(c) Establishment of open-promoter complex

[Figure 11-9 diagram labels: open complex; DNA]
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synthesized per unit time from genes with weak promoters is much less 
than from a strong promoter with the result that fewer protein 
molecules are made per unit time by genes with weak promoters. 
Promoter strength is one factor which determines the number of copies 
of each protein molecule present in the cell. In most cases examined so 
far the difference between weak and strong promoters lies in the 
structure of the -35 region.



Site Selection: IV. The CAP Site 



Some promoters totally lack the common -35 sequence, for example,
the λ pre, galP, and araBAD promoters. These are active only in the
presence of positive effector molecules (see Chapters 14 and 16); for
example, the λ pre promoter is active only when the λ cII protein is
present. The mechanisms of action of these effectors are not well
understood, though a study of the lac promoter suggests that they bind to a
site in the -50 to -30 region and, by a mechanism that differs from that